
Add GET /run/trace/{run_id} endpoint #272

Open
saathviksheerla wants to merge 4 commits into openml:main from saathviksheerla:feat/get-run-trace-endpoint

Conversation


@saathviksheerla saathviksheerla commented Mar 14, 2026

Description

Implements GET /run/trace/{run_id} as part of the run endpoints.

Fixes: #40
Related: #36

Matches PHP API behavior for error codes (571/572) and response shape.
Returns 412 with code 571 if run does not exist, 412 with code 572 if run
exists but has no trace, and trace rows on success.

Checklist

Please check all that apply. You can mark items as N/A if they don't apply to your change.

Always:

  • I have performed a self-review of my own pull request to ensure it contains all relevant information, and the proposed changes are minimal but sufficient to accomplish their task.

Required for code changes:

  • Tests pass locally
  • I have commented my code in hard-to-understand areas, and provided or updated docstrings as needed
  • I have added tests that cover the changes (only required if not already under coverage)

If applicable:

  • [N/A] I have made corresponding changes to the documentation pages (/docs)

Extra context:

  • This PR and the commits have been created autonomously by a bot/agent.


coderabbitai bot commented Mar 14, 2026

Walkthrough

Adds two RFC 9457 problem-detail errors (RunNotFoundError, RunTraceNotFoundError) to src/core/errors.py. Introduces src/database/runs.py with async helpers exist(id_, expdb) and get_trace(run_id, expdb). Adds Pydantic schemas TraceIteration and RunTrace in src/schemas/runs.py. Adds a FastAPI router at GET /run/trace/{run_id} in src/routers/openml/runs.py and registers it in src/main.py. Adds unit and migration tests covering successful retrieval, missing trace, and missing run.

🚥 Pre-merge checks: 5 passed

  • Title check ✅ Passed: The title accurately and concisely describes the main change: implementing a new GET endpoint for run traces, which is the primary objective of the PR.
  • Description check ✅ Passed: The description is clearly related to the changeset, explaining the implementation of the GET /run/trace endpoint with error handling and API behavior details.
  • Linked Issues check ✅ Passed: The PR implements the GET /run/trace/{run_id} endpoint as specified in issue #40, with proper error handling (codes 571/572) and response structure, and includes comprehensive tests validating the implementation.
  • Out of Scope Changes check ✅ Passed: All changes are directly related to implementing the GET /run/trace endpoint: error classes, database helpers, router setup, schemas, and comprehensive tests covering success and error cases.
  • Docstring Coverage ✅ Passed: Docstring coverage is 87.50%, which meets the required threshold of 80.00%.



codecov bot commented Mar 14, 2026

Codecov Report

❌ Patch coverage is 69.76744% with 13 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@f94808c). Learn more about missing BASE report.

Files with missing lines:

  • src/routers/openml/runs.py: 55.55% patch coverage, 8 lines missing
  • src/database/runs.py: 60.00% patch coverage, 4 lines missing
  • src/main.py: 50.00% patch coverage, 1 line missing
Additional details and impacted files
```
@@           Coverage Diff           @@
##             main     #272   +/-   ##
=======================================
  Coverage        ?   54.31%
=======================================
  Files           ?       37
  Lines           ?     1528
  Branches        ?      126
=======================================
  Hits            ?      830
  Misses          ?      696
  Partials        ?        2
```

@saathviksheerla saathviksheerla marked this pull request as ready for review March 14, 2026 05:50

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • In database.runs.get, since the result is only used for existence checking, consider selecting a constant (e.g. SELECT 1) and using scalar_one_or_none() to reduce overhead and avoid returning an unused row object.
  • The RunTrace.run_id field uses Field(serialization_alias="run_id") even though the attribute name matches the alias; you can likely drop the alias for simplicity unless you specifically need different input/output names.
  • The selected=row.selected == "true" mapping assumes the database stores the value as the exact string "true"; consider normalizing/handling boolean or other truthy representations to make this more robust against schema or data differences.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `database.runs.get`, since the result is only used for existence checking, consider selecting a constant (e.g. `SELECT 1`) and using `scalar_one_or_none()` to reduce overhead and avoid returning an unused row object.
- The `RunTrace.run_id` field uses `Field(serialization_alias="run_id")` even though the attribute name matches the alias; you can likely drop the alias for simplicity unless you specifically need different input/output names.
- The `selected=row.selected == "true"` mapping assumes the database stores the value as the exact string "true"; consider normalizing/handling boolean or other truthy representations to make this more robust against schema or data differences.

## Individual Comments

### Comment 1
<location path="tests/routers/openml/runs_test.py" line_range="10-19" />
<code_context>
+
+
+@pytest.mark.parametrize("run_id", [34])
+async def test_get_run_trace_success(run_id: int, py_api: httpx.AsyncClient) -> None:
+    """Test that trace data is returned for a run that has trace entries."""
+    response = await py_api.get(f"/runs/trace/{run_id}")
+    assert response.status_code == HTTPStatus.OK
+    body = response.json()
+    assert body["run_id"] == run_id
+    assert isinstance(body["trace"], list)
+    assert len(body["trace"]) > 0
+    first = body["trace"][0]
+    assert "repeat" in first
+    assert "fold" in first
+    assert "iteration" in first
+    assert "setup_string" in first
+    assert "evaluation" in first
+    assert "selected" in first
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen success test by asserting types/values of key fields, not only their presence

Since the test only verifies key presence, it won’t catch type regressions from the DB layer to the API schema. Please also assert that `selected` is a boolean and that `evaluation` is either `None` or numeric, e.g.:

```python
assert isinstance(first["selected"], bool)
assert (first["evaluation"] is None) or isinstance(first["evaluation"], (int, float))
```
This helps detect accidental changes in the response shape earlier.
</issue_to_address>

### Comment 2
<location path="tests/routers/openml/runs_test.py" line_range="28-33" />
<code_context>
+
+
+@pytest.mark.parametrize("run_id", [24])
+async def test_get_run_trace_no_trace(run_id: int, py_api: httpx.AsyncClient) -> None:
+    """Test that 412 is returned for a run that exists but has no trace."""
+    response = await py_api.get(f"/runs/trace/{run_id}")
+    assert response.status_code == HTTPStatus.PRECONDITION_FAILED
+    body = response.json()
+    assert body["code"] == "572"
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Extend `no_trace` test to validate the full problem details shape, not just the code

In addition to status and error `code`, please assert key fields from the problem details payload (e.g. `type`/`title`/`status`) so the error contract is fully covered and accidental changes are caught earlier. For example:

```python
assert body["status"] == HTTPStatus.PRECONDITION_FAILED
assert body["type"].endswith("/problems/run-trace-not-found")
assert body["title"] == "Run Trace Not Found"
```

It would also be helpful to mirror this in the `run_not_found` test using the corresponding `RunNotFoundError` values.
</issue_to_address>


@saathviksheerla saathviksheerla marked this pull request as draft March 14, 2026 05:54

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
src/database/runs.py (1)

30-33: Add explicit trace ordering for deterministic responses.

The trace query currently relies on DB default row order. Please sort explicitly so client output is stable and reproducible.

Proposed fix

```diff
             SELECT `repeat`, `fold`, `iteration`, `setup_string`, `evaluation`, `selected`
             FROM `trace`
             WHERE `run_id` = :run_id
+            ORDER BY `repeat`, `fold`, `iteration`
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/database/runs.py` around lines 30 - 33, The trace SELECT in
src/database/runs.py relies on unspecified DB ordering; update the query that
selects `repeat`, `fold`, `iteration`, `setup_string`, `evaluation`, `selected`
for a given `run_id` to include an explicit ORDER BY (e.g., ORDER BY `repeat`,
`fold`, `iteration`, `selected`) so results are deterministic and reproducible;
locate the SQL string used to fetch traces and append the ORDER BY clause to it
(keeping the WHERE `run_id` = :run_id filter intact).
tests/routers/openml/runs_test.py (1)

18-25: Strengthen success-path assertions for selected.

Please assert the accepted selected values, not just key presence, so contract regressions are caught.

Proposed fix

```diff
     first = body["trace"][0]
@@
     assert "selected" in first
+    assert first["selected"] in {"true", "false"}
```

Based on learnings: enum-like values in this API should be exposed as literal strings ("true"/"false"), so tests should verify that explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/runs_test.py` around lines 18 - 25, The test currently
only checks presence of the "selected" key on the first trace entry; update the
assertion to verify the value is one of the literal strings "true" or "false" so
regressions are caught—locate the variable first = body["trace"][0] in
tests/routers/openml/runs_test.py and replace the key-presence assert for
"selected" with an assertion that first["selected"] is in the allowed set
{"true", "false"} (i.e., explicitly check the string value, not just existence).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Line 40: The response mapping currently coerces the DB enum-like literal by
doing selected=row.selected == "true"; instead, preserve the original DB value
and pass row.selected through unchanged (replace the boolean coercion with the
literal), updating the mapping where "selected" is assigned (the expression
using row.selected) so the API shape remains the original string
("true"/"false").

In `@src/schemas/runs.py`:
- Line 14: The `selected` field in the runs schema is currently typed as bool
but must preserve literal string values "true"/"false"; change its type to
typing.Literal["true", "false"] (or str with a validator enforcing those exact
values) in the schema definition (the `selected` field in the model declared in
src/schemas/runs.py) and add a Pydantic validator if needed to ensure only
"true"/"false" are accepted and returned as strings so the response contract is
preserved.
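Outside of Pydantic, the Literal-based approach this prompt describes can be sketched with a small validator. This is a hypothetical helper, not part of the PR; in the actual schema the check would live in the Pydantic model.

```python
from typing import Literal, get_args

# The only values the PHP API emits for `selected`.
Selected = Literal["true", "false"]


def validate_selected(value: str) -> Selected:
    """Reject anything other than the literal strings 'true'/'false'."""
    if value not in get_args(Selected):
        raise ValueError(f"selected must be 'true' or 'false', got {value!r}")
    return value  # type: ignore[return-value]
```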


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2084608b-d09f-4e37-af42-5c84b9ffd054

📥 Commits

Reviewing files that changed from the base of the PR and between f94808c and dd83fea.

📒 Files selected for processing (6)
  • src/core/errors.py
  • src/database/runs.py
  • src/main.py
  • src/routers/openml/runs.py
  • src/schemas/runs.py
  • tests/routers/openml/runs_test.py

@saathviksheerla saathviksheerla marked this pull request as ready for review March 14, 2026 06:18

@sourcery-ai sourcery-ai bot left a comment


Hey - I've left some high level feedback:

  • In database.runs.get, consider returning a simple boolean (or using scalar_one_or_none() with SELECT 1 ... LIMIT 1) instead of a Row | None, since the query is only used as an existence check and this would make the intent and typing clearer.
  • The get_trace function currently returns a generic Sequence[Row]; if you add a typed Protocol/TypedDict or a small data class for the expected columns, you can get better type safety and make the mapping in get_run_trace more robust to schema changes.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `database.runs.get`, consider returning a simple boolean (or using `scalar_one_or_none()` with `SELECT 1 ... LIMIT 1`) instead of a `Row | None`, since the query is only used as an existence check and this would make the intent and typing clearer.
- The `get_trace` function currently returns a generic `Sequence[Row]`; if you add a typed `Protocol`/`TypedDict` or a small data class for the expected columns, you can get better type safety and make the mapping in `get_run_trace` more robust to schema changes.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
tests/routers/openml/runs_test.py (1)

11-41: Reduce seed-data coupling in endpoint tests.

These tests depend on fixed seeded IDs (Line 11, Line 28, Line 40). Consider arranging run/trace rows inside the test transaction (or via fixtures) so they stay deterministic if seed SQL changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/runs_test.py` around lines 11 - 41, The tests
test_get_run_trace_success, test_get_run_trace_no_trace (and the parametrized
case in test_get_run_trace_run_not_found) are coupled to seeded run IDs (34, 24,
999999); instead insert or create the required run and trace rows inside the
test or via a fixture (transaction-scoped or test-scoped factory) and use the
created run_id in the GET call to /runs/trace/{run_id}, or create a run without
traces for the PRECONDITION_FAILED case; ensure cleanup by using the test DB
transaction/fixture so tests remain deterministic if seed SQL changes.
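The arrange-in-test idea can be sketched with an in-memory SQLite stand-in. The real tests would use the project's expdb fixture against the experiment database; the table layout here is a simplified assumption for illustration only.

```python
import sqlite3


def make_run_with_trace(conn: sqlite3.Connection, run_id: int, n_rows: int) -> int:
    """Create a run and n_rows trace rows, returning the run id the test should query."""
    conn.execute("CREATE TABLE IF NOT EXISTS run (rid INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trace ("
        "run_id INTEGER, repeat INTEGER, fold INTEGER, iteration INTEGER,"
        "setup_string TEXT, evaluation REAL, selected TEXT)"
    )
    conn.execute("INSERT INTO run (rid) VALUES (?)", (run_id,))
    for i in range(n_rows):
        conn.execute(
            "INSERT INTO trace VALUES (?, 0, 0, ?, '', NULL, 'false')",
            (run_id, i),
        )
    return run_id


# The test arranges its own data instead of depending on seeded IDs like 34.
conn = sqlite3.connect(":memory:")
rid = make_run_with_trace(conn, run_id=7, n_rows=3)
count = conn.execute(
    "SELECT COUNT(*) FROM trace WHERE run_id = ?", (rid,)
).fetchone()[0]
```

Creating a run with `n_rows=0` would cover the 412/572 "exists but has no trace" case without relying on seed SQL.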

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 74f66776-f5ed-43cc-ad45-a5f9b1a40f25

📥 Commits

Reviewing files that changed from the base of the PR and between dd83fea and 966ba52.

📒 Files selected for processing (4)
  • src/database/runs.py
  • src/routers/openml/runs.py
  • src/schemas/runs.py
  • tests/routers/openml/runs_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/database/runs.py
  • src/schemas/runs.py

@saathviksheerla saathviksheerla changed the title Add GET /runs/trace/{run_id} endpoint Add GET /run/trace/{run_id} endpoint Mar 15, 2026

@PGijsbers PGijsbers left a comment


Thanks for taking the time to contribute! The changes look good though I did leave a remark to update one function. In addition, I want tests added that compare the output of the new Python-based implementation to that of the old PHP responses.
Please add a test module under tests/routers/openml/migration that compares the responses. It should make sure that we have known "translations" for the runs on the test server, i.e., when comparing the responses there can be differences (e.g., error handling is different), but the test should clearly outline those differences and force equality otherwise. See for example the tasks_migration_test.py file.

Comment on lines +10 to +23

```python
async def get(id_: int, expdb: AsyncConnection) -> Row | None:
    """Check if a run exists by ID."""
    row = await expdb.execute(
        text(
            """
            SELECT 1
            FROM `run`
            WHERE `rid` = :run_id
            """,
        ),
        parameters={"run_id": id_},
    )
    return row.one_or_none()
```

Contributor

Suggested change

```diff
-async def get(id_: int, expdb: AsyncConnection) -> Row | None:
+async def exist(id_: int, expdb: AsyncConnection) -> bool:
     """Check if a run exists by ID."""
     row = await expdb.execute(
         text(
             """
             SELECT 1
             FROM `run`
             WHERE `rid` = :run_id
             """,
         ),
         parameters={"run_id": id_},
     )
-    return row.one_or_none()
+    return bool(row.one_or_none())
```
I saw that sourcery suggested to use SELECT 1. That is fine, but at this point the function doesn't do what its name says it does. Update callsites accordingly.
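The updated callsite guard could look roughly like this once `get` becomes `exist`. This is a sketch with stubs: the in-memory `known_ids` set and the local `RunNotFoundError` stand in for the real database connection and the error class in src/core/errors.py.

```python
import asyncio


class RunNotFoundError(Exception):
    """Stub for the RFC 9457 error raised by the real endpoint."""


async def exist(id_: int, known_ids: set[int]) -> bool:
    # Stub: the real version runs `SELECT 1 FROM run WHERE rid = :run_id`
    # and returns bool(row.one_or_none()).
    return id_ in known_ids


async def get_run_trace(run_id: int, known_ids: set[int]) -> str:
    # Callsite reads naturally with the boolean-returning `exist`.
    if not await exist(run_id, known_ids):
        raise RunNotFoundError(f"Run {run_id} not found")
    return f"trace for {run_id}"


result = asyncio.run(get_run_trace(34, known_ids={34}))
```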

Contributor Author

Thanks for the review! Will rename get to exist returning bool and update the callsite. Will also add a migration test under tests/routers/openml/migration/ comparing PHP and Python responses for known runs.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
tests/routers/openml/migration/runs_migration_test.py (2)

38-42: ignore_order=True may mask ordering regressions.

If trace row order is contractually meaningful, this can hide real mismatches (Line 41). Consider sorting both payloads by (repeat, fold, iteration) first, then compare with order-sensitive diff.

Proposed adjustment

```diff
-    differences = deepdiff.diff.DeepDiff(
-        new_json,
-        php_response.json(),
-        ignore_order=True,
-    )
+    def _sorted_trace(payload: dict) -> dict:
+        copied = payload.copy()
+        copied["trace"] = copied["trace"].copy()
+        copied["trace"]["trace_iteration"] = sorted(
+            copied["trace"]["trace_iteration"],
+            key=lambda row: (int(row["repeat"]), int(row["fold"]), int(row["iteration"])),
+        )
+        return copied
+
+    differences = deepdiff.diff.DeepDiff(
+        _sorted_trace(new_json),
+        _sorted_trace(php_response.json()),
+        ignore_order=False,
+    )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/migration/runs_migration_test.py` around lines 38 - 42,
The current DeepDiff call uses ignore_order=True which can hide ordering
regressions; instead, sort both payload lists by the tuple keys (repeat, fold,
iteration) before diffing: extract the trace rows from new_json and
php_response.json(), sort each list using a key function that returns
(row['repeat'], row['fold'], row['iteration']) (or safe-access those fields),
then call deepdiff.DeepDiff on the sorted lists without ignore_order=True
(update the DeepDiff invocation where differences = deepdiff.diff.DeepDiff(...),
and reference new_json, php_response.json(), and the differences variable).

13-13: Consider naming the staged run-id intent explicitly.

34 is a magic value right now (Line 13). Adding an ids= label or a named constant would make fixture/staging intent clearer during failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/migration/runs_migration_test.py` at line 13, The
parametrize uses a magic literal for run_id (pytest.mark.parametrize("run_id",
[34])); replace it with an explicit intent label by either defining a named
constant (e.g., STAGED_RUN_ID = 34 and parametrize with [STAGED_RUN_ID]) or
adding an ids parameter to the parametrize call (e.g., ids=["staged-run-34"]) so
failures clearly show the staged run intent; update references to run_id in the
test to use the chosen constant or keep the existing variable name if using ids.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f0eab104-d74c-4a58-845b-cc4f1a06ad48

📥 Commits

Reviewing files that changed from the base of the PR and between 9376c1c and 0fb3dda.

📒 Files selected for processing (3)
  • src/database/runs.py
  • src/routers/openml/runs.py
  • tests/routers/openml/migration/runs_migration_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/routers/openml/runs.py
  • src/database/runs.py



Development

Successfully merging this pull request may close these issues.

GET /run/trace/{id}

2 participants